Does a heart transplant (treatment: A) cause survival (outcome: Y)?
Causal Effect: \(Y_{i}^{a=1} \neq Y_{i}^{a=0}\)
\(Y^{a=1}\) and \(Y^{a=0}\) are Potential Outcomes or Counterfactual Outcomes
“Rubin Causal Model” (Holland, 1986) framework:
Unit: The person, place, or thing upon which a treatment will operate, at a particular time
Treatment: An intervention (Pearl: “we do …”), the effects of which the investigator wishes to assess relative to no intervention
Potential Outcomes: The values of a unit’s measurement of interest after application of the treatment and non-application of the treatment
The Fundamental Problem of Causal Inference: We can observe at most one of the potential outcomes for each unit.
Definitions from “Basic Concepts of Statistical Inference for Causal Effects in Experiments and Observational Studies”; Donald Rubin; Harvard University; 2005
Formal definition of the average causal effect in the population: \(Pr[Y^{a=0}=1] \neq Pr[Y^{a=1}=1]\)
In our example, the null hypothesis of no average causal effect is true: \(Pr[Y^{a=0}=1] = 10/20 = Pr[Y^{a=1}=1]\)
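The comparison of the two counterfactual risks can be sketched in a few lines of Python. This is a minimal illustration, not the original data table: only the totals (10 events out of 20 individuals under each treatment level) come from the text above, while the pairing of outcomes per individual in `y_a0` and `y_a1` is a hypothetical arrangement chosen for the example.

```python
from fractions import Fraction

# Hypothetical potential outcomes for 20 individuals (1 = event, 0 = no event).
# Only the totals (10 of 20 under each treatment level) are taken from the
# text; which individual has which pair of outcomes is an assumption.
y_a0 = [1] * 10 + [0] * 10                      # outcome if untreated
y_a1 = [1] * 4 + [0] * 6 + [1] * 6 + [0] * 4    # outcome if treated


def risk(outcomes):
    """Proportion of individuals with outcome 1, as an exact fraction."""
    return Fraction(sum(outcomes), len(outcomes))


risk_a0 = risk(y_a0)  # Pr[Y^{a=0} = 1]
risk_a1 = risk(y_a1)  # Pr[Y^{a=1} = 1]

# An average causal effect exists only if the two counterfactual risks differ.
average_causal_effect = risk_a1 != risk_a0

# Individual causal effects: units whose two potential outcomes differ.
individual_effects = sum(1 for y0, y1 in zip(y_a0, y_a1) if y0 != y1)

print(f"Pr[Y^(a=0)=1] = {risk_a0}, Pr[Y^(a=1)=1] = {risk_a1}")
print("average causal effect:", average_causal_effect)
print("individuals with an individual causal effect:", individual_effects)
```

Under this made-up pairing, some individuals have an individual causal effect even though the average causal effect is null. In real data such a table cannot be built, because at most one potential outcome per unit is observed.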
Quote: “… the only way, we are told, that physicians can understand probabilities: odds being a difficult concept only comprehensible to statisticians, bookies, punters and readers of the sports pages of popular newspapers…”; Statistical Issues in Drug Development; Stephen Senn; Wiley; 2007
\(1^{st}\) source of random error: sampling variability. We assume consistent estimators \(\widehat{Pr}[Y^a=1]\), but we need statistical procedures to quantify the uncertainty that comes from observing only a limited number of individuals from our population.
\(2^{nd}\) source of random error: nondeterministic counterfactuals. The outcome may be driven by an inherently stochastic process (e.g. quantum mechanics, gene expression, …); see chapter 10.
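As a rough illustration of sampling variability, the sketch below computes a Wald (normal approximation) confidence interval for an estimated risk. The choice of the Wald interval, the helper name `wald_interval` and the two sample sizes are assumptions made for this example; the slides do not prescribe a particular procedure.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(events, n, level=0.95):
    """Normal-approximation confidence interval for a proportion.

    Hypothetical helper for illustration; other interval types
    (e.g. Wilson) behave better for small n or extreme proportions.
    """
    p_hat = events / n
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Same estimated risk (0.5) from a small and a large hypothetical sample.
small = wald_interval(10, 20)
large = wald_interval(1000, 2000)

print(f"n = 20:   ({small[0]:.3f}, {small[1]:.3f})")
print(f"n = 2000: ({large[0]:.3f}, {large[1]:.3f})")
```

The point estimate is identical in both cases, but the interval narrows as the number of observed individuals grows.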
\[Pr[Y=1|A=1]=7/13\] \[Pr[Y=1|A=0]=3/7\] We conclude that in our population treatment A and outcome Y are dependent / associated because \(Pr[Y=1|A=1] \neq Pr[Y=1|A=0]\).
Independence \(Y\perp \!\!\! \perp A\) is defined as \(Pr[Y=1|A=1] = Pr[Y=1|A=0]\).
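The association check can be reproduced from the counts given above (7 events among 13 treated, 3 events among 7 untreated). This is a minimal sketch: the list `observed` is rebuilt from those counts only, and the risk difference and risk ratio are added here as common associational measures, not as quantities quoted from the slides.

```python
from fractions import Fraction

# Observed (A, Y) pairs rebuilt from the counts in the text:
# 13 treated with 7 events, 7 untreated with 3 events.
observed = [(1, 1)] * 7 + [(1, 0)] * 6 + [(0, 1)] * 3 + [(0, 0)] * 4


def conditional_risk(data, a):
    """Pr[Y = 1 | A = a], estimated as an exact fraction."""
    outcomes = [y for treatment, y in data if treatment == a]
    return Fraction(sum(outcomes), len(outcomes))


risk_treated = conditional_risk(observed, 1)    # Pr[Y=1 | A=1]
risk_untreated = conditional_risk(observed, 0)  # Pr[Y=1 | A=0]

# A and Y are associated when the two conditional risks differ.
associated = risk_treated != risk_untreated
risk_difference = risk_treated - risk_untreated
risk_ratio = risk_treated / risk_untreated

print(f"Pr[Y=1|A=1] = {risk_treated}, Pr[Y=1|A=0] = {risk_untreated}")
print("associated:", associated)
print(f"risk difference = {risk_difference}, risk ratio = {risk_ratio}")
```

These quantities use only the observed treatment and outcome, so they describe association. Whether they can be read as causal effects depends on how the treatment was assigned.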
Background: We tested the primary hypothesis that breast cancer recurrence after potentially curative surgery is lower with regional anaesthesia-analgesia using paravertebral blocks and the anaesthetic propofol than with general anaesthesia with the volatile anaesthetic sevoflurane and opioid analgesia.
Methods: We did a randomised controlled trial at 13 hospitals in Argentina, Austria, … Primary analyses were done under intention-to-treat principles.
Findings: Between Jan 30, 2007, and Jan 18, 2018, 2132 women were enrolled to the study, … Baseline characteristics were well balanced between study groups. Among women assigned regional anaesthesia-analgesia, 102 (10%) recurrences were reported, compared with 111 (10%) recurrences among those allocated general anaesthesia (hazard ratio 0·97, 95% CI 0·74–1·28; p=0·84).
Interpretation: In our study population, regional anaesthesia-analgesia (paravertebral block and propofol) did not reduce breast cancer recurrence after potentially curative surgery. Clinicians can use regional or general anaesthesia with respect to breast cancer recurrence …
In his talk on the influence of causal inference on recommender systems (Amit Sharma; DataEngConf NYC 2016):
Social networks such as the “social news aggregator” platform reddit (1.65 billion comments up to May 2015) provide a relevant communications platform. Understanding the evolution of its user community is therefore essential for monitoring community health, predicting individual user trajectories and supporting effective recommendations. In the study by Barbosa et al. (arXiv; 2015), comment length served as a proxy for user effort to answer the question: “is reddit getting worse over time?”.
“A large university is interested in investigating the effects on the students of the diet provided in the university dining halls and any sex differences in these effects. Various types of data are gathered. In particular, the weight of each student at the time of his arrival in September and his weight the following June are recorded.” (Lord; 1967)
Figure from “Lord’s Paradox Revisited - Oh Lord! Kumbaya!”; Judea Pearl; 2016
Figure source: Verena Wally et al.; Diacerein orphan drug development for epidermolysis bullosa simplex: A phase 2/3 randomized, placebo-controlled, double-blind clinical trial; J Am Acad Dermatol.; 2018
Figure source: Sonia Hernández-Díaz et al.; The Birth Weight “Paradox” Uncovered?; Americ. Journ. Epidem.; 2006
Birth weight paradox: “The birth-weight paradox concerns the relationship between the birth weight and mortality rate of children born to tobacco smoking mothers. It is dubbed a ‘paradox’ because, contrary to expectations, low birth-weight children born to smoking mothers have a lower infant mortality rate than the low birth weight children of non-smokers” (“Lord’s Paradox Revisited - Oh Lord! Kumbaya!”; Judea Pearl; 2016)
Mendelian randomization “(MR) is a burgeoning field that involves the use of genetic variants to assess causal relationships between exposures and outcomes. MR studies can be straightforward; for example, genetic variants within or near the encoding locus that is associated with protein concentrations can help to assess their causal role in disease.” (Holmes et al.; Mendelian randomization in cardiometabolic disease: challenges in evaluating causality; Nat. Rev. Cardiology; 2017)